# Median Selection with Structural and Noisy Information

# NeurIPS 2025 Submission (Anonymous)

This repository contains the implementation of our NeurIPS 2025 submission, which proposes efficient algorithms for median selection using structural information (DAGs) and noisy comparisons. The code is provided for reproducibility evaluation.

We provide three core algorithms:

* **rand**: Randomized selection using structural information (DAG + chain decomposition)
* **det**: Deterministic selection using structural information
* **vote**: Voting-based lazy selection that combines weak and strong oracles

## Requirements

* C++17 or later
* CMake >= 3.10

## Build Instructions

From the root directory:

```bash
mkdir build
cd build
cmake ..
make
```

This will produce an executable named `main` in the `build/` directory.

## Running the Code

The program supports the following command-line options:

| Flag | Algorithms | Description                                                 |
| ---- | ---------- | ----------------------------------------------------------- |
| `-n` | all        | Number of elements in the input array                       |
| `-m` | rand, det  | Number of edges in the DAG (used to control DAG width)      |
| `-k` | rand, det  | Rank of the target element (e.g., `n/2` for median)         |
| `-c` | vote       | Voting repetition constant (controls number of comparisons) |
| `-e` | vote       | Error rate of the weak oracle (e.g., `0.1` = 90% accuracy)  |
| `-a` | all        | Algorithm to run: `rand`, `det`, or `vote`                  |

All outputs are printed directly to the terminal.

### Example Usages

**Randomized Selection (DAG-based):**

```bash
./main -n 10000 -m 40000 -k 5000 -a rand
```

**Deterministic Selection (DAG-based):**

```bash
./main -n 10000 -m 40000 -k 5000 -a det
```
Note: For `rand` and `det`, the input DAG is randomly generated with `n` nodes and `m` edges.
The number of edges must not exceed the maximum possible in a DAG on `n` nodes: `m ≤ n(n−1)/2`.
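The edge bound above can be illustrated with a minimal sketch of random DAG generation. This is not the repository's generator: `random_dag_edges` and its `seed` parameter are hypothetical names, and the sketch assumes edges are sampled uniformly among pairs `(u, v)` with `u < v`, which keeps the graph acyclic by construction.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <set>
#include <utility>
#include <vector>

// Hypothetical helper (not the repository's API): sample m distinct edges
// (u, v) with u < v. Because every edge points from a smaller to a larger
// index, 0, 1, ..., n-1 is a topological order and the graph is a DAG.
std::vector<std::pair<int, int>> random_dag_edges(int n, std::int64_t m,
                                                  std::uint32_t seed) {
    const std::int64_t max_edges = static_cast<std::int64_t>(n) * (n - 1) / 2;
    // This is the constraint m <= n(n-1)/2 stated in the note.
    assert(n >= 0 && m >= 0 && m <= max_edges);
    if (n < 2) return {};  // no edges are possible

    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> pick(0, n - 1);
    std::set<std::pair<int, int>> chosen;

    // Rejection sampling: simple, but slow when m is close to max_edges.
    while (static_cast<std::int64_t>(chosen.size()) < m) {
        int u = pick(rng);
        int v = pick(rng);
        if (u == v) continue;
        if (u > v) std::swap(u, v);
        chosen.insert({u, v});  // duplicates are ignored by the set
    }
    return std::vector<std::pair<int, int>>(chosen.begin(), chosen.end());
}
```

For example, `random_dag_edges(10000, 40000, 1u)` would produce an edge list matching the sizes used in the commands above; requesting more than `n(n−1)/2` edges trips the assertion.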


**Voting-Based Lazy Selection:**

```bash
./main -n 10000 -c 4 -e 0.1 -a vote
```

In `vote` mode, `-k` is not required; `k` is set internally to `sqrt(n)`.

## Output Format

Each run prints output lines like:

```
[vote] result: 4999, weak: 124, strong: 5, correct: 1
[rand] result: 4999, width: 47, comparisons: 21500
[det] result: 4999, width: 47, comparisons: 19800
```

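* `result`: Index of the selected element
* `weak` / `strong`: Number of calls to the weak and strong oracles (`vote`)
* `width`: DAG width, computed as `n` minus the size of the maximum matching (`rand`, `det`)
* `comparisons`: Number of comparisons performed (`rand`, `det`)
* `correct`: Whether the result matches the true median (`vote`)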
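The relation between width and maximum matching can be sketched as follows. This is an illustrative assumption rather than the repository's code: `dag_width` is a hypothetical name, and the sketch builds the bipartite graph from the transitive closure, so that `n` minus the matching size equals the size of a largest antichain (Dilworth's theorem). Whether `bipartite.cpp` uses the closure or the raw edges is not stated here.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper: width of a DAG as n - (maximum matching) in the
// bipartite graph with an edge (u_left, v_right) whenever v is reachable
// from u. The input is assumed to be acyclic.
int dag_width(int n, const std::vector<std::pair<int, int>>& edges) {
    std::vector<std::vector<int>> adj(n);
    for (auto [u, v] : edges) adj[u].push_back(v);

    // reach[s][v] == 1 iff there is a non-empty path from s to v.
    std::vector<std::vector<char>> reach(n, std::vector<char>(n, 0));
    for (int s = 0; s < n; ++s) {
        std::vector<int> stack{s};
        while (!stack.empty()) {
            int u = stack.back();
            stack.pop_back();
            for (int v : adj[u]) {
                if (!reach[s][v]) {
                    reach[s][v] = 1;
                    stack.push_back(v);
                }
            }
        }
    }

    // Kuhn's augmenting-path algorithm for bipartite matching.
    std::vector<int> match_right(n, -1);  // left partner of each right node
    std::vector<char> seen;
    std::function<bool(int)> try_augment = [&](int u) {
        for (int v = 0; v < n; ++v) {
            if (!reach[u][v] || seen[v]) continue;
            seen[v] = 1;
            if (match_right[v] == -1 || try_augment(match_right[v])) {
                match_right[v] = u;
                return true;
            }
        }
        return false;
    };

    int matching = 0;
    for (int u = 0; u < n; ++u) {
        seen.assign(n, 0);
        if (try_augment(u)) ++matching;
    }
    return n - matching;
}
```

The sketch takes roughly cubic time in the worst case and quadratic memory, so it is only meant for small `n`. A path `0 → 1 → 2 → 3` has width 1, while four isolated nodes have width 4.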

## Implementation Notes

* DAGs are generated as complete DAGs with `n` nodes and `m` edges.
* For `rand` and `det`, the DAG is converted to a bipartite graph and matched.
* Chains are computed via decomposition and used in selection.
  Note: Our implementation uses a straightforward chain decomposition, which is slower than the state-of-the-art method assumed in the paper.
  Because our experiments measure only query complexity (number of comparisons), this difference does not affect the validity of the reported results.

* The `vote` algorithm uses sampling and adaptive voting with weak and strong oracles.
* Sort comparisons are also counted as strong oracle calls in `vote`.
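
To make the weak-oracle voting idea concrete, here is a minimal sketch under stated assumptions. `WeakOracle` and `majority_less` are hypothetical names, the oracle is assumed to flip each answer independently with probability equal to the error rate, and the number of repetitions is passed in directly, since how `-c` maps to a repetition count is not specified in this README.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical weak oracle: answers "a < b", but each answer is flipped
// independently with probability error_rate (an assumed noise model).
struct WeakOracle {
    double error_rate;
    std::mt19937 rng;
    std::int64_t calls = 0;  // number of weak oracle calls so far

    WeakOracle(double e, std::uint32_t seed) : error_rate(e), rng(seed) {}

    bool less(int a, int b) {
        ++calls;
        const bool truth = a < b;
        std::bernoulli_distribution flip(error_rate);
        return flip(rng) ? !truth : truth;
    }
};

// Majority vote over an odd number of weak queries. More repetitions lower
// the chance of a wrong verdict, at the cost of more weak calls.
bool majority_less(WeakOracle& oracle, int a, int b, int repetitions) {
    assert(repetitions > 0 && repetitions % 2 == 1);
    int yes = 0;
    for (int i = 0; i < repetitions; ++i) {
        if (oracle.less(a, b)) ++yes;
    }
    return 2 * yes > repetitions;
}
```

With `WeakOracle oracle(0.1, 1u)`, a call such as `majority_less(oracle, x, y, 5)` spends five weak calls on one comparison; a strong oracle would be reserved for pairs where the vote is not decisive enough.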

## File Structure

```
code/
├── include/              # Header files
│   ├── DAG.hpp
│   ├── bipartite.hpp
│   ├── prediction.hpp
│   └── Lazy.hpp
├── src/                  # Source files
│   ├── main.cpp
│   ├── DAG.cpp
│   ├── bipartite.cpp
│   ├── prediction.cpp
│   └── lazy.cpp
├── CMakeLists.txt        # Build file
└── README.md             # This file
```

## Notes

* This code is anonymized. No identifying information (e.g., author names, affiliations, emails) is included.
* All data is synthetic and generated within the code.
* Randomness is used in sampling and DAG construction.
* All experiments are deterministic given the same random seed.
* The `rand` and `det` algorithms have been tested with up to n = 10^4 elements.
* The `vote` (lazy selection) algorithm has been tested with up to n = 10^7 elements.

If you encounter any reproducibility issues, please contact us anonymously through the OpenReview comment system.
